Abstract Body 


Background / Context: 

School choice has emerged as a key demand-side intervention in school reform globally. 
School vouchers act as a market-based reform by allowing parents to choose any school for their 
children. Both government and privately sponsored voucher programs exist. The effectiveness of 
voucher programs is fiercely disputed in both academic and policy circles. Most reviews of the 
school voucher literature have been selective, not systematic. Prior research by Rouse & Barrow 
(2009); Anderson, Guzman, & Ringquist (2013); and Epple, Romano & Urquiola (2015) either 
did not systematically search for all the empirical evaluations of school voucher participant 
effects or relied heavily on non-experimental findings even when a large number of more 
rigorous studies were available. A thorough meta-analysis informed by a true systematic review 
of all the available randomized controlled trial (RCT or “experimental”) studies would provide 
the foundation for a greater scholarly consensus regarding the ability of school vouchers to 
improve outcomes for students. 

Purpose / Objective / Research Question / Focus of Study: 

The objective of this meta-analysis is to rigorously assess the participant effects of 
private school vouchers, or in other words, to estimate the average academic impacts that the 
offer (or use) of a voucher has on a student. This review will add to the literature by being the 
first to systematically review all randomized controlled trials (RCTs) in an international context. 
Our meta-analytic results will focus on the RCTs because these are the “gold standard” of 
program evaluation in terms of assessing causal relationships. RCTs essentially compare a 
treatment group (those receiving the offer of a voucher) with a control group (those who 
did not receive the offer of a voucher). In RCTs the assignment of a voucher is random, and 
therefore the issue of selection bias is resolved in expectation. 
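
The point about selection bias can be illustrated with a small simulation. The sketch below is not drawn from any of the reviewed studies: the baseline trait `ability`, the sample of 10,000 students, and the self-selection rule are all hypothetical. It compares the baseline gap between groups under a lottery with the gap that arises when students select themselves into the program.

```python
import random
import statistics

def mean_gap(treated, control):
    """Difference in mean baseline ability between two groups."""
    return statistics.mean(treated) - statistics.mean(control)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical baseline trait for 10,000 simulated students.
ability = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Lottery: a coin flip decides who is offered the voucher.
offers = [rng.random() < 0.5 for _ in ability]
lottery_gap = mean_gap(
    [a for a, won in zip(ability, offers) if won],
    [a for a, won in zip(ability, offers) if not won],
)

# Self-selection (hypothetical rule): only above-average students take part.
selected_gap = mean_gap(
    [a for a in ability if a > 0],
    [a for a in ability if a <= 0],
)

print(f"lottery gap:        {lottery_gap:+.3f}")
print(f"self-selection gap: {selected_gap:+.3f}")
```

Under the lottery the two groups differ at baseline only by chance, so the gap is close to zero, whereas self-selection builds a large baseline gap into the comparison before any program effect occurs.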

The majority of RCTs studying the participant effects of school vouchers have been 
conducted in the United States. While voucher systems exist in many parts of the world, only a 
small number of voucher RCTs have been conducted outside the US. Therefore, we will present 
three meta-analytic estimates of the impacts of school vouchers: (1) just in the U.S.; (2) just 
outside the U.S.; and (3) globally including the U.S. and all other countries. 

Our initial search was guided by the following research question: What is the impact of 
private school vouchers globally on the student achievement of those students offered the 
vouchers? 

Our focus throughout this study will be to see what impact, if any, school voucher 
programs, in the United States and throughout the world, have had on student test scores. If the 
findings are mixed, we shall try to determine unique patterns that are driven either by geography 
or relevant program design components. We will also compare overall outcomes for reading and 
math scores for programs within the US vs. outside the US and for publicly funded vs. privately 
funded programs. This comparison can help policymakers design future private school voucher 
programs. Reading assessments are included only if they were in English, regardless of the 
language of the country in which they were administered. We do this to ensure commonality in 
the international reading assessments and also because the international voucher evaluations in 
the meta-analysis come from developing countries where English language skills are highly 
valued. 


Setting: 

The RCTs included in our analysis were located in four countries: the United States of 
America, Kenya, Colombia and India. Although the included studies span four 
continents (North America, South America, Africa, and Asia), the majority of RCTs were 
administered within the United States. The U.S. studies covered programs in Charlotte, NC; 
Dayton, OH; Milwaukee, WI; New York City; Toledo, OH; and Washington, DC. 

Population / Participants / Subjects: 

The participants in the RCTs were children who attended private schools through a 
school voucher. The grades analyzed ranged from K to 12, although most individual RCTs 
included a shorter grade range in their analysis. The sample sizes for treatment and control 
groups, as well as the overall sample sizes, will be reported in our study and informed our 
meta-analysis calculations. 

Intervention / Program / Practice: 

The programs evaluated were publicly or privately funded school voucher or K-12 
“scholarship” programs. Voucher programs provide tuition scholarships to eligible students that 
enable them to attend their choice of any participating private school. Most of the private schools 
that participate in voucher programs in the U.S. and other countries are relatively low-cost 
schools with per-student costs below the average amount spent in area public schools. The 
duration of studies analyzed ranged from one year to six years. The earliest program evaluated 
was administered in 1990 in Milwaukee, WI, and the latest program evaluated was administered 
in 2011 in Delhi, India. 

Research Design: 

The research design of the studies that inform the meta-analysis was random assignment 
of children to treatment and control groups. Most studies had a one-stage randomization through 
administration of a lottery while one study in Andhra Pradesh, India (Muralidharan & 
Sundararaman, 2015) was based on a two-stage randomization (students were randomly assigned 
within randomly assigned villages). We combine the results of the experimental studies systematically, 
using the impact estimates and variances reported in the actual studies, to generate overall 
measures of average voucher impact (Cohen’s d) along with 95% confidence intervals around 
those estimates. 
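
As an illustration of how study-level estimates and variances can be combined into one overall estimate with a 95% confidence interval, the following is a minimal sketch of inverse-variance pooling. It assumes a fixed-effect model, which the abstract does not specify, and the three effect sizes and standard errors are hypothetical values chosen for the example, not results from the reviewed studies.

```python
import math

def pool_fixed_effect(effects, std_errors, z=1.96):
    """Inverse-variance (fixed-effect) pooled estimate with a 95% CI."""
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se

# Hypothetical study-level estimates (d) and standard errors.
effects = [0.10, 0.02, 0.05]
std_errors = [0.05, 0.04, 0.10]

pooled, lower, upper = pool_fixed_effect(effects, std_errors)
print(f"pooled d = {pooled:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

More precise studies (smaller standard errors) receive more weight, so the pooled estimate sits closest to the best-measured effects.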

Data Collection and Analysis: 

For this meta-analysis, we identified publications from computer and networked searches 
through a variety of sources. The first search was through EBSCO, JSTOR and ProQuest 
databases available at the library at the University of Arkansas (2,737 articles). The second 
search utilized the Google Scholar search engine (6,570 articles). Additionally, we searched 
various internet sources including but not limited to the websites of the National Bureau of 
Economic Research (NBER), University of Chile, Uppsala University in Sweden, and Poverty 
Action Lab at MIT. No new RCT studies were found through this search. Last, we conducted a 
network search by matching our list of potential sources against earlier publications on 
school vouchers internationally, followed by a review by Dr. Patrick J. Wolf, an author of this 
study (no additional articles found). From our two primary searches (library and Google Scholar), we 
found 9,307 articles in total. 

Following the search stage, the team members excluded duplicate studies (543) and studies 
whose titles and/or abstracts were not relevant to school vouchers or reported results not 
relevant to participant effects (8,488). Then, the team members located the full articles of all the 
remaining studies in the list and read them one-by-one. At this stage, 255 studies not relevant to 
our analysis or having non-experimental research designs were excluded. See Appendix B for 
details on the studies eliminated at each stage. 

The remaining twenty-one articles were coded in MS Excel for details on author, 
publication year, location, funding type (public/private), years of evaluation, grades analyzed, 
outcome (reading (English)/math), size of treatment and control groups, and overall sample size. 
Some studies had multiple evaluation years for the same program. We kept only the 
studies reporting the last year for which the program evaluation results were available. The entire 
search process was performed separately by at least two team members so they could match their 
results and minimize human error. An additional six studies were excluded at the coding stage 
due to repeat coverage or insufficient information. 
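
The counts reported for each screening stage can be checked against one another. The short sketch below recomputes the exclusion flow using only the numbers stated in the text and in Appendix B; the variable names are ours.

```python
# Counts as reported in the text and in Appendix B.
library_found, scholar_found = 2_737, 6_570
duplicates = 534 + 9                      # library + Google Scholar duplicates
title_abstract_excluded = 2_075 + 6_413   # library + Google Scholar screening
full_text_excluded = 255
coding_excluded = 6

total_found = library_found + scholar_found
after_screening = total_found - duplicates - title_abstract_excluded
coded = after_screening - full_text_excluded
final_rcts = coded - coding_excluded

print(total_found, after_screening, coded, final_rcts)  # 9307 276 21 15
```

The totals agree with the text: 9,307 articles found, 276 retained for full-text review, twenty-one coded, and fifteen RCT studies in the final sample.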

As some studies did not report their findings in detail, we made necessary assumptions to 
derive sample sizes for treatment and control groups. For the meta-analysis, we weighted 
each study by the inverse of its variance and calculated the pooled standard deviation and effect size 
using Cohen’s d. We also calculated the unbiased d and the standard error for the effect size. 
When required, the effect sizes were instead calculated from the correlation or t-ratio. Lastly, the grand 
effect size and the lower and upper bounds of the overall 95% confidence interval were 
calculated. 
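
For readers unfamiliar with these quantities, the sketch below gives the standard textbook formulas for Cohen's d, its small-sample ("unbiased") correction, its approximate standard error, and the conversions from a t-ratio or a correlation. These are common conventions and are assumed here; the abstract does not state the exact formulas used. The means, standard deviations, and group sizes in the example are hypothetical.

```python
import math

def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

def unbiased_d(d, n_t, n_c):
    """Small-sample correction of d (often called Hedges' g)."""
    return d * (1.0 - 3.0 / (4.0 * (n_t + n_c) - 9.0))

def d_standard_error(d, n_t, n_c):
    """Approximate large-sample standard error of d."""
    return math.sqrt((n_t + n_c) / (n_t * n_c) + d**2 / (2.0 * (n_t + n_c)))

def d_from_t(t, n_t, n_c):
    """Convert an independent-samples t-ratio to d."""
    return t * math.sqrt(1.0 / n_t + 1.0 / n_c)

def d_from_r(r):
    """Convert a point-biserial correlation to d (assumes similar group sizes)."""
    return 2.0 * r / math.sqrt(1.0 - r**2)

# Hypothetical example: treatment mean 52, control mean 50, SD 10, 100 per group.
d = cohens_d(52.0, 50.0, 10.0, 10.0, 100, 100)
g = unbiased_d(d, 100, 100)
se = d_standard_error(d, 100, 100)
print(f"d = {d:.3f}, unbiased d = {g:.3f}, SE = {se:.3f}")
```

The squared standard error from `d_standard_error` is the sampling variance whose inverse serves as the study weight in the pooled estimate.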

Findings / Results: 

We report effect sizes for fifteen studies for math scores and fourteen studies for reading 
scores. For math scores, the effect sizes are positive for twelve of the fifteen studies; the exceptions 
are Rouse (1998), Bettinger and Slonim (2003), and Muralidharan and Sundararaman (2015). Although 
the math effects are positive for most studies, we fail to reject the null hypothesis for all but two 
studies: Howell, Wolf, Campbell, and Peterson’s DC results (2002) and Greene, Peterson, and Du (1999). 
For reading scores, the effect sizes are positive for thirteen of the fourteen studies; the exception is 
Krueger and Zhu (2004). Although the effects are positive for most studies, we fail to reject the null 
hypothesis for all but four studies: Cowen (2008); Muralidharan and Sundararaman (2015); Wolf, 
Egalite, and Dixon (2015); and Howell, Wolf, Campbell, and Peterson’s DC results (2002). The overall 
global effect size from the meta-analysis indicates null impacts on math scores [95% CI: -0.003 to 0.057 
standard deviations] and positive, but small, impacts on reading scores [95% CI: 0.066 to 0.127 
standard deviations]. 

For math outcome measures, the effect size is null overall. Studies in the US (d = 0.031) 
[95% CI: -0.008 to 0.069 standard deviations] tended to be slightly more positive than studies 
outside the US (d = 0.021) [95% CI: -0.027 to 0.069 standard deviations]. Moreover, there is a 
significant positive effect size for privately funded vouchers (d = 0.037) [95% CI: 0.001 to 0.074 
standard deviations] and a null effect of government funded vouchers (d = 0.005) [95% CI: 
-0.050 to 0.059 standard deviations]. We fail to reject the null math effects for publicly funded 
vouchers as well as for vouchers in US and non-US regions considered separately. Privately 
funded programs seem to produce larger gains in math scores while government funded programs 
have an overall null effect. However, this does not imply that private funding is 
causing the differential effect. 

For reading outcome measures, the effect size is positive overall for studies in and 
outside of the US. The effect size is much greater for non-US regions (d = 0.136) [95% CI: 0.087 
to 0.185 standard deviations] than for the US (d = 0.071) [95% CI: 0.032 to 0.111 standard 
deviations]. Additionally, the effect size for privately funded vouchers (d = 0.102) [95% CI: 
0.064 to 0.139 standard deviations] is larger than that for publicly funded vouchers (d = 0.087) 
[95% CI: 0.033 to 0.142 standard deviations]. Thus, for reading scores, private- as well as 
government-funded programs produced positive effects, but again privately funded programs 
seem to produce larger effects than publicly funded programs. However, we cannot be confident 
that the funding structures are causing the differential. Last, for reading scores, none of the 
confidence intervals for effect sizes (public funding, private funding, within US, non-US) 
contains zero, indicating that across all these different comparisons, we find significantly 
positive impacts of school vouchers on reading scores globally. 
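
The significance statements above follow directly from whether each reported interval contains zero. The sketch below applies that check to the subgroup intervals exactly as reported in this section and also backs out the standard error implied by each interval, assuming a symmetric normal-theory interval with a multiplier of 1.96 (an assumption on our part).

```python
# Subgroup 95% confidence intervals as reported in the text above.
reported = {
    ("math", "US"): (-0.008, 0.069),
    ("math", "non-US"): (-0.027, 0.069),
    ("math", "private"): (0.001, 0.074),
    ("math", "public"): (-0.050, 0.059),
    ("reading", "US"): (0.032, 0.111),
    ("reading", "non-US"): (0.087, 0.185),
    ("reading", "private"): (0.064, 0.139),
    ("reading", "public"): (0.033, 0.142),
}

def excludes_zero(lower, upper):
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0

def implied_se(lower, upper, z=1.96):
    """Standard error implied by a symmetric interval (assumes z = 1.96)."""
    return (upper - lower) / (2 * z)

significant = {key for key, ci in reported.items() if excludes_zero(*ci)}
for (subject, group), (lower, upper) in reported.items():
    flag = "excludes zero" if (subject, group) in significant else "contains zero"
    se = implied_se(lower, upper)
    print(f"{subject:8s}{group:8s} [{lower:+.3f}, {upper:+.3f}] SE~{se:.3f} {flag}")
```

Only the privately funded subgroup excludes zero for math, while all four reading subgroups do. The private and public intervals overlap for both subjects, which is consistent with the caution that the funding difference should not be read as causal.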

Conclusions: 

This meta-analysis contributes to the field by combining and systematically evaluating 
rigorous evidence from RCT studies. This review provides a broader overview of all the rigorous 
experimental findings and will have important policy implications about the effectiveness of 
voucher programs generally. While voucher programs are growing across the globe, a 
meta-analysis of the participant effect of vouchers internationally was lacking. As the first 
meta-analysis of its type, it will help establish the baseline for future studies. 

We should emphasize that for evaluations of the same program done over multiple years, 
we chose to analyze the results for the latest year available. Voucher programs appear to be 
having overall positive effects in reading, but more RCTs outside of the US are needed to reflect 
the presence of these programs around the world. 

In terms of policy recommendations, there are a couple of different conclusions we can draw 
from these results. We found that, in general, privately funded programs show more positive 
effects, but this could have several explanations. For example, private donors may provide 
better planning, implementation, and oversight than government funders. In addition, privately 
funded programs may be more likely to have financial support for RCT studies when they are 
thought to be succeeding, so that these types of studies are more prevalent in the literature. 
Appendix B. Details on Search and Exclusion Process 

                                                              Number of Articles

Search 1 (University of Arkansas Library)
  Three library sources (EBSCO, JSTOR, ProQuest)                    2,737
  Duplicates Removed                                                 -534
  Unique articles (EBSCO, JSTOR, ProQuest)                          2,203
  Excluded Based on Title and/or Abstract                          -2,075
  Remaining Articles (EBSCO, JSTOR, ProQuest)                         128

Search 2 (Google Scholar)
  Number of Google Scholar Sources Initially Found                  6,570
  Excluded Based on Title and Abstract                             -6,413
  Remaining Google Articles                                           157
  Duplicates Removed                                                   -9
  Remaining Articles (Google Scholar)                                 148

Sum of Remaining Articles (Both Searches)                             276
Excluded Based on Full Article                                       -255
Excluded at Coding Stage (due to repeat coverage or
  insufficient information)                                            -6
Total search results (RCTs)                                            15
Appendix C. Tables and Figures 

Not included in page count. 

Table 1: Study Characteristics for Math Outcomes 

S. No. | Authors | Pub Year | Program Evaluated | Location | Funding | Duration of Study | Grades | Sample Size | Outcome Measure | Results | Comments
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
1 | Rouse | 1998 | Milwaukee Parental Choice Program (MPCP) | Milwaukee, WI (USA) | Public | 1990-1994 (5 years) | K to 8 | N=3177 (T=1589, C=1588) | Math | Positive | 
2 | Greene, Peterson & Du | 1999 | Milwaukee Parental Choice Program (MPCP) | Milwaukee, WI (USA) | Public | 1990-1994 (5 years) | K to 8 | N=317 (T=237, C=80) | Math | Positive | 3- and 4-year outcomes combined. Calculated treatment and control using Table 1 (p. 199).
3 | Greene | 2000 | Charlotte Children’s Scholarship Fund | Charlotte | Private | 1999-2000 (1 year) | 2 to 8 | N=357 (T=223, C=134) | Math | Null | Used the "Instrumental w/ background controls" as this was the most rigorous measure (Table 3). Used the p-value to calculate a t-statistic in order to find the standard error of the effect size. The total sample of 357 was split by the treatment/control ratio in Table 1.
4 | Angrist, Bettinger, Bloom, King & Kremer | 2002 | Programa de Ampliacion de Cobertura de la Educacion Secundaria (PACES) | Colombia | Public (partly funded by World Bank) | 1995-1999 (4 years) | 6 to 10 | N=282 (T=157, C=125) | Math | Null | 
5a | Howell, Wolf, Campbell & Peterson | 2002 | School Choice Scholarships Foundation | NYC | Private | 1997-1999 (2 years) | 1 to 4 | N=1199 (T=600, C=599) | Math | Null | Year 2 results. Assumed 50/50 treatment and control split. Combined math and reading score, so the effect size was assumed the same for both subjects.
5b | Howell, Wolf, Campbell & Peterson | 2002 | Parents Advancing Choice in Education | Dayton, OH | Private | 1998-2000 (2 years) | K to 12 | N=382 (T=191, C=191) | Math | Null | Year 2 results. Assumed 50/50 treatment and control split. Combined math and reading score, so the effect size was assumed the same for both subjects.
5c | Howell, Wolf, Campbell & Peterson | 2002 | Washington Scholarship Fund | Washington, DC | Private | 1998-2000 (2 years) | K to 8 | N=725 (T=363, C=362) | Math | Positive | Year 2 results. Assumed 50/50 treatment and control split. Combined math and reading score, so the effect size was assumed the same for both subjects.
6 | Bettinger & Slonim | 2003 | Children's Scholarship Fund | Toledo, OH (USA) | Private | 1998-2001 (4 years) | K to 8 | N=360 (T=118, C=242) | Math | Null | 
7 | Krueger & Zhu | 2004 | New York City School Choice Program | NYC | Private | 1997-2000 (4 years) | K to 4 | N=1801 (T=901, C=900) | Math | Null | Assumed 50/50 split of T and C due to lack of detailed data.
8 | Bettinger | 2005 | Programa de Ampliacion de Cobertura de la Educacion Secundaria (PACES) | Colombia | Public (partly funded by World Bank) | 1995-1999 (4 years) | 6 to 10 | N=282 (T=141, C=141) | Math | Null | Assumed 50/50 split of T and C due to lack of detailed data.
9 | Cowen | 2008 | Charlotte Children’s Scholarship Fund | Charlotte | Private | 1999-2000 (1 year) | 2 to 8 | N=694 (T=347, C=347) | Math | Null | Assumed 50/50 split of T and C due to lack of detailed data.
10 | Kremer, Miguel & Thornton | 2009 | Girls' Scholarship Program | Kenya | Private | 2001-2003 (3 years) | 6 to 8 | N=3602 (T=970, C=2632) | Math (Girls) | Null | This was a merit-based voucher for girls only (but still assigned randomly within the group of eligible students).
11 | Wolf, Kisida, Gutmann, Puma, Eissa & Rizzo | 2013 | District of Columbia Opportunity Scholarship Program (OSP) | DC | Public | 2004-2009 (6 years) | K to 12 | N=1330 (T=516, C=814) | Math | Null | 
12 | Muralidharan & Sundararaman | 2015 | Andhra Pradesh (AP) School Choice Experiment | Andhra Pradesh, India | Private | 2008-2012 (4 years) | 1 to 5 | N=4385 (T=1675, C=2710) | Math | Null | 
13 | Wolf, Egalite & Dixon | 2015 | Ensure Access to Better Learning Experiences (ENABLE) | Delhi, India | Private | 2011-2013 (2 years) | K to 2 | N=1618 (T=835, C=783) | Math | Null | Information provided by Dr. Wolf and Anna Egalite on request.

Table 2: Study Characteristics for Reading Outcomes 

S. No. | Authors | Pub Year | Program Evaluated | Location | Funding | Duration of Study | Grades | Sample Size | Outcome Measure | Results | Comments/Assumptions Made
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
1 | Rouse | 1998 | Milwaukee Parental Choice Program (MPCP) | Milwaukee, WI (USA) | Public | 1990-1994 (5 years) | K to 8 | N=3163 (T=1582, C=1581) | Reading | Null | 
2 | Greene, Peterson & Du | 1999 | Milwaukee Parental Choice Program (MPCP) | Milwaukee, WI (USA) | Public | 1990-1994 (5 years) | K to 8 | N=317 (T=237, C=80) | Reading | Null | 3- and 4-year outcomes combined. Calculated treatment and control using Table 1 (p. 199).
3 | Greene | 2000 | Charlotte Children’s Scholarship Fund | Charlotte | Private | 1999-2000 (1 year) | 2 to 8 | N=357 (T=223, C=134) | Reading | Null | Used the "Instrumental w/ background controls" as this was the most rigorous measure (Table 3). Used the p-value to calculate a t-statistic in order to find the standard error of the effect size. The total sample of 357 was split by the treatment/control ratio in Table 1.
4 | Angrist, Bettinger, Bloom, King & Kremer | 2002 | Programa de Ampliacion de Cobertura de la Educacion Secundaria (PACES) | Colombia | Public (partly funded by World Bank) | 1995-1999 (4 years) | 6 to 10 | N=283 (T=157, C=126) | Reading | Null | 
5a | Howell, Wolf, Campbell & Peterson | 2002 | School Choice Scholarships Foundation | NYC | Private | 1997-1999 (2 years) | 1 to 4 | N=1199 (T=600, C=599) | Reading | Null | Year 2 results. Assumed 50/50 treatment and control split. Combined math and reading score, so the effect size was assumed the same for both subjects.
5b | Howell, Wolf, Campbell & Peterson | 2002 | Parents Advancing Choice in Education | Dayton, OH | Private | 1998-2000 (2 years) | K to 12 | N=382 (T=191, C=191) | Reading | Null | Year 2 results. Assumed 50/50 treatment and control split. Combined math and reading score, so the effect size was assumed the same for both subjects.
5c | Howell, Wolf, Campbell & Peterson | 2002 | Washington Scholarship Fund | Washington, DC | Private | 1998-2000 (2 years) | K to 8 | N=725 (T=363, C=362) | Reading | Positive | Year 2 results. Assumed 50/50 treatment and control split. Combined math and reading score, so the effect size was assumed the same for both subjects.
6 | Krueger & Zhu | 2004 | New York City School Choice Program | NYC | Private | 1997-2000 (4 years) | K to 4 | N=1801 (T=901, C=900) | Reading | Null | Assumed 50/50 split of T and C due to lack of detailed data.
7 | Bettinger | 2005 | Programa de Ampliacion de Cobertura de la Educacion Secundaria (PACES) | Colombia | Public (partly funded by World Bank) | 1995-1999 (4 years) | 6 to 10 | N=283 (T=157, C=126) | Reading | Null | Assumed 50/50 split of T and C due to lack of detailed data.
8 | Cowen | 2008 | Charlotte Children’s Scholarship Fund | Charlotte | Private | 1999-2000 (1 year) | 2 to 8 | N=694 (T=347, C=347) | Reading | Positive | Assumed 50/50 split of T and C due to lack of detailed data.
9 | Kremer, Miguel & Thornton | 2009 | Girls' Scholarship Program | Kenya | Private | 2001-2003 (3 years) | 6 to 8 | N=3602 (T=970, C=2632) | Reading (Girls) | Null | This was a merit-based voucher for girls only (but still assigned randomly within the group of eligible students).
10 | Wolf, Kisida, Gutmann, Puma, Eissa & Rizzo | 2013 | District of Columbia Opportunity Scholarship Program (OSP) | DC | Public | 2004-2009 (6 years) | K to 12 | N=1328 (T=855, C=473) | Reading | Positive | 
11 | Muralidharan & Sundararaman | 2015 | Andhra Pradesh (AP) School Choice Experiment | Andhra Pradesh, India | Private | 2008-2012 (4 years) | 1 to 5 | N=4217 (T=1607, C=2610) | Reading | Positive | 
12 | Wolf, Egalite & Dixon | 2015 | Ensure Access to Better Learning Experiences (ENABLE) | Delhi, India | Private | 2011-2013 (2 years) | K to 2 | N=1618 (T=835, C=783) | Reading (English) | Positive | Information provided by Dr. Wolf and Anna Egalite on request.

Fig 1: Effect sizes of vouchers on students’ math outcomes 
[Forest plot not reproduced.]

Fig 2: Effect sizes of vouchers on students’ reading outcomes 
[Forest plot not reproduced.]